APPLICATIONS OF PROBABILITY TO MECHANICS
By Edwin Bidwell Wilson

Introduction. The problem of the application of the theory of prob- 
ability to the motion of a dynamical system and in particular to finding, under 
definitely specified assumptions, the average values of the important quantities 
connected with the motion lies quite within the domain of investigations in pure 
mathematics. In a broad way the theory that thus arises may be termed statis- 
tical mechanics; from a narrower point of view, however, statistical mechanics 
may be considered as embracing only so much of the theory as may be built 
upon the special assumptions which lead to the Maxwell law for the distribu- 
tion of velocities and to the theory of the equipartition of kinetic energy. 
Hence it is that the advances in statistical mechanics have been made almost 
exclusively by mathematical physicists at first interested in the kinetic theory 
of gases and subsequently concerned with finding a mechanical foundation for 
thermodynamics, a mechanical formulation of entropy and the law of degrada- 
tion or dissipation of energy.* 

The treatment of these physical analogies requires the systems which are 
taken into consideration to have a very great number of degrees of freedom 
and consequently introduces into the subject of statistical mechanics, in its 
broadest sense, a complexity which is by no means an essential part of it and 
which goes a long way toward making difficult the acquisition of the elements 
of the theory. To set forth, as simply as may be, the essential and elementary 
steps in the application of the theory of probabilities to dynamical systems is 
the object of these pages ; and for this reason the number of degrees of freedom 
of the systems considered will at first be kept as small as possible. The 
amount of mathematics which is required is small and is amply covered by the 
discussions of the application of definite integrals to determining average 
values of quantities continuously distributed, which may be found in the usual 
texts on integral calculus.†


* Especial mention may be made of Maxwell, Boltzmann, Planck, Gibbs, Zermelo,
O. Reynolds, and Jeans.

† See Byerly's Integral Calculus, chap. 15; Todhunter's Integral Calculus, chap. 14;
Williamson's Integral Calculus, chap. 12.

As it is necessary to have clearly in mind what the term probability
means when applied to continuously distributed magnitude, it may be well
to recall the definitions with the aid of an example. Consider what meaning
shall be attributed to the expression "the average ordinate in a semicircle"
(which may be assumed to lie in the first and second quadrants). To find the
average of a finite set of quantities $q_1, q_2, \dots, q_n$ which are repeated
respectively $w_1, w_2, \dots, w_n$ times (the $w$'s are called the weights of the
quantities), one writes

$$\bar q = \frac{q_1 w_1 + q_2 w_2 + \cdots + q_n w_n}{w_1 + w_2 + \cdots + w_n} = q_1 \frac{w_1}{\sum w_i} + q_2 \frac{w_2}{\sum w_i} + \cdots + q_n \frac{w_n}{\sum w_i},$$

where $\bar q$ denotes the average. The second form of $\bar q$ is interpretable in terms
of probability. For there are altogether the number $\sum w_i$ of the $q$'s and there
are $w_j$ of the quantities $q_j$; and hence the probability or chance that a $q$ picked
out at random is $q_j$ is the ratio $w_j / \sum w_i$. Thus

$$\bar q = q_1 p_1 + q_2 p_2 + \cdots + q_n p_n$$

is also the value of the average, where $p_i$ denotes the chance or probability of
an arbitrary $q$ being $q_i$.
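
The agreement of the weighted form and the probability form of the average can be checked on a small example. The following is a minimal Python sketch and not part of the original paper; the function names and the sample numbers are illustrative assumptions.

```python
def weighted_average(values, weights):
    """Average of quantities q_i repeated w_i times: sum(q_i * w_i) / sum(w_i)."""
    total = sum(weights)
    return sum(q * w for q, w in zip(values, weights)) / total


def probability_average(values, weights):
    """The same average written as sum(q_i * p_i), with p_i = w_i / sum(w)."""
    total = sum(weights)
    probabilities = [w / total for w in weights]  # these sum to unity
    return sum(q * p for q, p in zip(values, probabilities))


if __name__ == "__main__":
    q = [1.0, 2.0, 4.0]   # hypothetical quantities
    w = [2, 1, 1]         # hypothetical weights
    print(weighted_average(q, w), probability_average(q, w))
```

Multiplying every weight by the same factor leaves both results unchanged, which is the sense in which a weight carries an arbitrary factor of proportionality while a probability does not.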

Now take up the question of the ordinates in the semicircle. There are 
an infinite number of the ordinates and the probability that an ordinate taken 
at random be a specified ordinate is zero. The definition of average as given 
for a finite number of quantities is illusory. To obtain an average it becomes 
necessary to make some specific assumption as to the distribution of the ordi- 
nates. If in the first place we assume that the totality of the ordinates may 
be represented by a finite number n of them which are uniformly distributed 
along the diameter so that there is one ordinate y in each equal interval dx
of the diameter or that the number of ordinates is proportional to the length
of the portion upon which they stand, then the weight to be attached to the 
ordinate or ordinates between x and x + dx will naturally be taken to be dx 
or any multiple kdx of dx and the average will be 

$$\bar y = \frac{y_1\,dx_1 + y_2\,dx_2 + \cdots + y_n\,dx_n}{dx_1 + dx_2 + \cdots + dx_n} = \frac{\sum y\,dx}{\sum dx}.$$

In this manner the infinite number of ordinates is really replaced by a finite 
but very large number which are spaced or distributed in a definite manner. 
To obtain a definite and unique result the number of ordinates may be imagined
to increase without limit and the individual intervals dx to decrease
without limit so that the average of the ordinates is

$$\bar y = \lim \frac{\sum y\,dx}{\sum dx} = \frac{\int_{-a}^{+a} y\,dx}{\int_{-a}^{+a} dx} = \int_{-a}^{+a} y\,\frac{dx}{2a} = \frac{\pi}{4}\,a,$$

where a is the radius of the semicircle. 

It would be possible to make an assumption of distribution which differed 
from the foregoing. We might for instance imagine that the ordinates were 
suspended at equal intervals from the circular arc. In this case it is clear that 
there must be relatively fewer ordinates over a portion of the diameter near 
the center and relatively more over an equal portion near the extremities. In 
fact the weights which are now to be attached are proportional to ds, the element
of arc on the circumference. Under this assumption the average of the
ordinates is

$$\bar y = \lim \frac{\sum y\,ds}{\sum ds} = \frac{\int_{-a}^{+a} y\,ds}{\int_{-a}^{+a} ds} = \int_{-a}^{+a} y\,\frac{ds}{\pi a} = \frac{2}{\pi}\,a.$$

Thus it is seen how vitally the value of the average of a quantity distributed 
continuously depends on the assumption which is made relative to the distri- 
bution. 
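
The dependence of the average on the assumed distribution can be seen numerically. The following is a minimal Python sketch under stated assumptions: a midpoint-rule sum stands in for each integral, and the function names and the sample radius are hypothetical choices made for illustration.

```python
import math


def average_ordinate_uniform_in_x(a, n=100_000):
    """Ordinates spaced uniformly along the diameter (weights dx)."""
    dx = 2 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * dx                  # midpoint of the i-th interval
        total += math.sqrt(a * a - x * x) * dx   # ordinate y times weight dx
    return total / (2 * a)                       # divide by the sum of weights


def average_ordinate_uniform_in_arc(a, n=100_000):
    """Ordinates hung at equal intervals of arc (weights ds = a * dtheta)."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        total += a * math.sin(theta) * a * dtheta  # ordinate y times weight ds
    return total / (math.pi * a)                   # divide by the arc length


if __name__ == "__main__":
    a = 2.0
    print(average_ordinate_uniform_in_x(a), math.pi * a / 4)
    print(average_ordinate_uniform_in_arc(a), 2 * a / math.pi)
```

The two sums approach $\pi a/4$ and $2a/\pi$ respectively, so the same set of ordinates yields two different averages according to the weights attached.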

Another point to mention is that the probability that an ordinate lies 
between x and x + dx is the weight divided by the sum (the integral) of the 
weights, that is, it is 

$$P\,dx = \frac{dx}{\int_{-a}^{+a} dx} = \frac{dx}{2a} \quad\text{and}\quad P\,dx = \frac{ds}{\int_{-a}^{+a} ds} = \frac{ds}{\pi a}$$

in the two cases discussed above. It is of course an infinitesimal with dx.
The quotient of this infinitesimal probability by the differential dx is called
the probability function or the frequency; thus under the two above assumptions
we have respectively

$$P = \frac{1}{2a} \quad\text{and}\quad P = \frac{1}{\pi a}\,\frac{ds}{dx} = \frac{1}{\pi\sqrt{a^2 - x^2}}.$$

The only difference between the probability Pdx and the weight is that the
weight may contain an arbitrary factor of proportionality whereas the proba- 
bility is limited by the relation that the sum of all the probabilities represents 
a certainty, that is, is equal to unity. In this way the value of the factor of 
proportionality is fixed. 

1. Time-averages in a dynamical system with one degree of 
freedom. Consider the motion of the simple pendulum, or better, the motion 
of a cycloidal pendulum which may execute vibrations of any magnitude and yet 
be governed by the equations of simple harmonic motion. Let the mass of 
the oscillating particle * be taken as unity. Then 

$$\frac{d^2x}{dt^2} = -n^2 x, \qquad v = \frac{dx}{dt} = n\sqrt{a^2 - x^2}, \qquad x = a\sin(nt + \gamma).$$

What is the probability that the vibrating particle be found between the
values x and x + dx of its coordinate? The time that the particle spends in the 
interval dx is obtained from its velocity as 

$$dt = \frac{dx}{v} = \frac{dx}{n\sqrt{a^2 - x^2}}.$$

Hence if the natural assumption be made that the probability of the particle 
being in the interval dx is proportional to the time spent in traversing 
that interval, the probability will be 

$$\frac{k\,dx}{n\sqrt{a^2 - x^2}}, \quad\text{with}\quad \int_{-a}^{+a} \frac{k\,dx}{n\sqrt{a^2 - x^2}} = 1$$

as an additional condition due to the fact that the particle must surely lie 
between the limits —a and +a which give the amplitude of the vibration. 
The determination of k gives the value 

$$k = \frac{1}{\displaystyle\int_{-a}^{+a} \frac{dx}{n\sqrt{a^2 - x^2}}} = \frac{n}{\pi}.$$



* We shall speak of a particle merely for the convenience of having a definite concrete
conception before us. It should be distinctly noted, however, that the discussion depends
in no way upon this convention, but holds equally well for any system whose motion is
adequately represented by the equation $\ddot x = -n^2 x$.

This determination of the probability fixes the value of the average of
quantities connected with the motion. For if u be any quantity of which the 
average is desired, it is merely necessary to form the integral of u times the 
probability over the whole interval. Thus 



$$\bar u = \int_{-a}^{+a} u\,\frac{n}{\pi}\,\frac{dx}{n\sqrt{a^2 - x^2}}.$$



If it be desired to average u over only a part of the interval, the probability 
for that partial interval need not be computed ; it suffices to regard the proba- 
bility already obtained, or any quantity proportional to it, as a weight and 
write 



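$$\bar u = \frac{\displaystyle\int_{\alpha}^{\beta} u\,\frac{dx}{\sqrt{a^2 - x^2}}}{\displaystyle\int_{\alpha}^{\beta} \frac{dx}{\sqrt{a^2 - x^2}}}.$$

Thus the average value of the square of the displacement is

$$\overline{x^2} = \frac{1}{\pi}\int_{-a}^{+a} \frac{x^2\,dx}{\sqrt{a^2 - x^2}} = \tfrac{1}{2}a^2,$$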



and the average value of the displacement itself is 

$$\bar x = \frac{1}{\pi}\int_{-a}^{+a} \frac{x\,dx}{\sqrt{a^2 - x^2}} = 0,$$



as was to be expected from the physically obvious fact that the particle 
vibrates in a manner symmetrical with regard to the origin x = 0. 

Again, if it be desired to find the average value of the displacement 
taken without regard to sign, it is possible, in view of the symmetry, to 
take the average of x over positive values and write 



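$$\overline{|x|} = \frac{\displaystyle\frac{1}{\pi}\int_{0}^{a} \frac{x\,dx}{\sqrt{a^2 - x^2}}}{\displaystyle\frac{1}{\pi}\int_{0}^{a} \frac{dx}{\sqrt{a^2 - x^2}}} = \frac{2}{\pi}\,a.$$

The same procedure would be possible in case of any odd power of x. In
a similar manner the average velocity of the particle may be found as

$$\bar v = \frac{\displaystyle\frac{1}{\pi}\int_{0}^{a} \frac{v\,dx}{\sqrt{a^2 - x^2}}}{\displaystyle\frac{1}{\pi}\int_{0}^{a} \frac{dx}{\sqrt{a^2 - x^2}}} = \frac{2n}{\pi}\int_{0}^{a} dx = \frac{2}{\pi}\,na.$$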



To find the average of the kinetic energy the whole interval may be con- 
sidered. Then 

$$\overline{\tfrac{1}{2}v^2} = \frac{1}{2\pi}\int_{-a}^{+a} n^2\sqrt{a^2 - x^2}\,dx = \tfrac{1}{4}n^2a^2.$$

Such averages as these which have been obtained are called time-averages
because they are evaluated upon the supposition that the probability or
weights have been determined by the assumption that the weight is propor- 
tional to the time spent in traversing the interval. 
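
These time-averages can be reproduced by sampling the motion at instants spaced uniformly in time over one full period. The following is a minimal Python sketch, not the paper's own procedure: the function name `time_averages`, the dictionary keys, and the sample values of a and n are assumptions made for illustration, and the velocity is averaged without regard to sign.

```python
import math


def time_averages(a, n, gamma=0.0, samples=100_000):
    """Average quantities of x = a*sin(n*t + gamma) over one period,
    giving equal weight to equal intervals of time."""
    period = 2 * math.pi / n
    dt = period / samples
    sums = {"x": 0.0, "x2": 0.0, "abs_x": 0.0, "abs_v": 0.0, "kinetic": 0.0}
    for i in range(samples):
        t = (i + 0.5) * dt
        x = a * math.sin(n * t + gamma)
        v = n * a * math.cos(n * t + gamma)   # velocity, unit mass
        sums["x"] += x
        sums["x2"] += x * x
        sums["abs_x"] += abs(x)
        sums["abs_v"] += abs(v)
        sums["kinetic"] += 0.5 * v * v
    return {name: total / samples for name, total in sums.items()}


if __name__ == "__main__":
    a, n = 1.5, 2.0
    avg = time_averages(a, n)
    print(avg["x2"], a * a / 2)                # mean square displacement
    print(avg["abs_x"], 2 * a / math.pi)       # mean magnitude of displacement
    print(avg["abs_v"], 2 * n * a / math.pi)   # mean speed
    print(avg["kinetic"], n * n * a * a / 4)   # mean kinetic energy
```

Each sampled value agrees with the corresponding integral taken with the weight $dx/\sqrt{a^2 - x^2}$, which is the statement that this weight is the one proportional to the time.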

That the average is obtained on the supposition that the time is the vari- 
able according to which the distribution is uniform may also be seen from the 
formula for the probability. With the time as independent variable the infinitesimal
probability would be written as $P\,dt$. Hence it follows that

$$P\,dt = \frac{1}{\pi}\,\frac{dx}{\sqrt{a^2 - x^2}} = \frac{1}{\pi}\,\frac{1}{\sqrt{a^2 - x^2}}\,\frac{dx}{dt}\,dt = \frac{n}{\pi}\,dt,$$

and the probability function or frequency appears as the constant $n/\pi$ and
indicates uniform distribution with respect to t. The process of finding the
average of any quantity u may be likened to that of finding the average ordinate
in the semicircle. For if u, which is some quantity connected with the
motion and hence is a function of t, be plotted as the curve $u = u(t)$ between
the limits $t = -\gamma/n$ and $t = \pi/n - \gamma/n$ which correspond to a complete
vibration of the particle from x = −a to x = +a, then the average ordinate of the
curve will be precisely the time-average of u, provided the average ordinate is
found on the assumption of uniform distribution of the ordinates relatively to t.
Frequently, however, it is preferable to follow the vibrating particle in 
space* instead of in time and to consider a quantity u connected with the 
motion as a function u = u(x) of x rather than of t. The infinitesimal probability
when x is the variable is $P\,dx$ and

$$P\,dx = \frac{1}{\pi}\,\frac{dx}{\sqrt{a^2 - x^2}}, \qquad P = \frac{1}{\pi}\,\frac{1}{\sqrt{a^2 - x^2}}.$$

Under this assumption the probability function is not constant but is inversely
proportional to the velocity. From the work done above in determining
average values it is clear that the time-averages were in reality obtained with x
as the variable although t is the variable with respect to which distribution is
uniform. The average of u is not the average ordinate in the curve u = u(x)
between the limits x = −a and x = +a when the ordinates are considered as
distributed uniformly in x; the density of distribution must be inversely
proportional to the velocity which is a function of x.

Thus far it has been a single vibrating particle which has been under 
consideration and has been followed whether in time or in space over a com- 
plete swing or part of a swing for the purpose of obtaining the averages of 
quantities connected with its motion. For purposes of generalization it is 
convenient somewhat to change the point of view. Instead of following the 
particle in space (and thus following it implicitly in time) let us suppose that 
at any instant there are an infinite number of particles distributed throughout 
the interval from x = — a to x = + a, each one moving exactly as the single 
particle heretofore considered would move if at the instant considered it 
happened to be passing through the position of that particle. This amounts to 
changing from a single particle with a definite phase-angle $\gamma$ to a set or ensemble
of particles with different phase-angles but otherwise entirely alike. Furthermore
let the density D of the distribution of particles through the interval be
proportional to the probability function so that

$$D \propto P = \frac{1}{\pi}\,\frac{1}{\sqrt{a^2 - x^2}}.$$

Now it appears at once that for the average of any quantity u taken for all the
particles of the ensemble the weights will be $D\,dx$† and hence that

$$\bar u = \frac{\int_{-a}^{+a} uD\,dx}{\int_{-a}^{+a} D\,dx} = \frac{\int_{-a}^{+a} uP\,dx}{\int_{-a}^{+a} P\,dx} = \int_{-a}^{+a} uP\,dx \qquad (\text{since } D \propto P)$$

and that in like manner the average taken over a part of the ensemble will be

$$\bar u = \frac{\int_{\alpha}^{\beta} uD\,dx}{\int_{\alpha}^{\beta} D\,dx} = \frac{\int_{\alpha}^{\beta} uP\,dx}{\int_{\alpha}^{\beta} P\,dx}.$$

* Of course it would, strictly speaking, be impossible to distribute the particles
continuously. This, however, is a difficulty which arises in any treatment of a body as a
continuous distribution of matter or mass. The density here is a linear density and the
number of particles or the amount of matter in the interval dx is Ddx as in the ordinary
treatment of a chain or rod.

† For Ddx represents the amount of matter or number of particles in the interval dx.
The conception is entirely similar to that previously mentioned in connection with the fact that
Pdx was the number of ordinates in the interval dx when P was the frequency or density of
the ordinates.



From the above formulas it is seen that the average values of u taken
over the ensemble or any part of it are identical with the corresponding aver- 
ages found by following a single particle. This identity is not fortuitous but 
arises necessarily from the law of density assumed for the distribution of the 
particles in the ensemble. The advantages of passing from time-averages to 
ensemble-averages will appear more clearly in the next section. One point 
should be noted here. The density of the distribution of the particles has
been assigned at a specified instant. From that instant on, each particle will 
execute a simple harmonic motion and therefore at any subsequent instant the 
density of the distribution will be determined by the original distribution and 
by the motion of the individual particles. It is extremely important, in fact 
it is essential, that the density of distribution shall not change with the time ; 
otherwise if the ensemble-averages were taken at a later time, the averages 
would not in general be identical with the time-averages. 

Suppose that D is the density of distribution of particles in the ensemble.
Consider any interval dx between x = −a and x = +a. The number of particles in
this interval is $\int_{x}^{x+dx} D\,\delta x$,* and it is desired to express the fact or
the condition that there is no change in this number owing to the motion of
the particles, that is, that as many particles are brought into the interval dx
as are removed from it during each interval of time. For convenience let the
interval of time $\delta t$ be chosen as small, small even relatively to the time that
it takes a particle to move over the interval dx, and let the velocity of the
particles be positive. In the time $\delta t$ the particles which are in the neighborhood
of the point x have the velocity v and will move through the
distance $\delta x = v\,\delta t$. The number of particles in the interval $\delta x = v\,\delta t$ just to
the left of x is $D\,\delta x = Dv\,\delta t$ and it is this number which will advance into
the interval dx in the time $\delta t$. The velocity of particles in the neighborhood
of x + dx is v + dv and the distance they will move is $\delta' x = (v + dv)\,\delta t$.
The number of particles in the interval $\delta' x = (v + dv)\,\delta t$ just to the left of
x + dx is $(D + dD)\,\delta' x = (D + dD)(v + dv)\,\delta t$ and it is this number which
will advance out of the interval dx in the time $\delta t$. Hence if the number in the
interval is to remain constant, it is necessary and sufficient that

$$(D + dD)(v + dv)\,\delta t - Dv\,\delta t = 0.$$

* The symbol $\delta x$ is used in place of dx in the integrand to avoid confusion with the total
interval dx.

The condition thus obtained may be simplified by discarding $\delta t$, by
expanding and canceling $Dv$, and by omitting the infinitesimal $dD\,dv$ of the
second order. The result is the differential equation

$$v\,dD + D\,dv = 0, \quad\text{of which}\quad Dv = \text{constant}$$

is the integral. It therefore appears that the necessary and sufficient condi- 
tion that the distribution of the particles remain unchanged owing to their mo- 
tion is that the density be inversely proportional to the velocity. As this was 
the law assumed above, it is clear that that distribution was permanent in time. 
Such a distribution is said to be in statistical equilibrium. The result $Dv = \text{const.}$
just obtained depends only on the facts that the particle has one degree
of freedom and that the velocity is a function of the coordinate specifying the 
position of the particle.* In such mechanical systems the only possible distri- 
bution in statistical equilibrium is that in which the density is inversely 
proportional to the velocity, and averages taken over the ensemble thus distri- 
buted are identical with the time-averages found by following a given particle 
throughout its motion. 
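
The permanence of the distribution with density inversely proportional to the velocity can be illustrated by moving a finite ensemble forward in time. The following is a minimal Python sketch under stated assumptions: the ensemble is a finite list of phase-angles, the fraction of particles with |x| below a chosen bound serves as a probe of the density, and all function names are hypothetical. A second ensemble, spread uniformly in x with all velocities positive, is included for contrast.

```python
import math


def equilibrium_phases(count):
    """Phase-angles spread uniformly over a full cycle; the resulting
    density in x is inversely proportional to the speed."""
    return [2 * math.pi * (i + 0.5) / count for i in range(count)]


def uniform_in_x_phases(count):
    """Phase-angles chosen so that the initial positions are uniform in x
    on (-a, +a), every particle moving to the right."""
    return [math.asin(-1 + 2 * (i + 0.5) / count) for i in range(count)]


def evolve(phases, a, n, t):
    """Positions x = a*sin(n*t + phase) of all particles at time t."""
    return [a * math.sin(n * t + phase) for phase in phases]


def fraction_inside(positions, half_width):
    """Fraction of the ensemble with |x| < half_width."""
    return sum(1 for x in positions if abs(x) < half_width) / len(positions)


if __name__ == "__main__":
    a, n, count = 1.0, 1.0, 60_000
    for label, phases in (("equilibrium", equilibrium_phases(count)),
                          ("uniform in x", uniform_in_x_phases(count))):
        before = fraction_inside(evolve(phases, a, n, 0.0), a / 2)
        after = fraction_inside(evolve(phases, a, n, math.pi / 2), a / 2)
        print(label, before, after)
```

For the first ensemble the fraction stays near 1/3 at both instants; for the second it starts at 1/2 and changes as the particles move, so averages taken over it would depend on the instant chosen.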

2. Ensemble-averages in a dynamical system with one degree 
of freedom. In the foregoing section it has been assumed that a definite 
particle was under consideration and that consequently the amplitude a of the 
vibration was known. To know the phase-angle was unimportant because the 
particle was followed throughout a vibration. The ensemble which was introduced
was obtained by disregarding the phase-angle and assuming an infinity of
particles all with the same amplitude (and consequently with the same
energy) but distributed over all possible phase-angles. This means that one
constant of integration, the constant $\gamma$, has been regarded as unknown and that
a distribution relative to it has been introduced to make up for the lack of
knowledge. Suppose next that all that is known about the particle is that it
satisfies the differential equation of simple harmonic motion with a given value
for n. The initial conditions which determine the constants of integration a
and $\gamma$ are unknown; no particular particle can be followed in its motion. It
is necessary to replace the particle by an ensemble in which distribution with
regard to both constants of integration is allowed. This ensemble will be a
two-dimensional ensemble, not a one-dimensional ensemble as before.

* It should be noted that the conception of statistical equilibrium as set forth in these
paragraphs requires the motion to be conservative.

The initial conditions which determine the motion of any particular
particle are the position and the velocity at any instant; these may be interpreted
as the coordinates of a point in an auxiliary x-v-plane. Thus at the
given initial instant there will correspond to each possible motion regulated
by the given differential equation a definite point of the x-v-plane, and conversely
to each point of the plane (possibly only within certain restricted
regions of the plane) there will correspond some particular motion with specific
amplitude and phase-angle. The point (x, v) will be called the representative
point of the motion. As time goes on, the representative point for
any particular motion will change from its initial position and will describe
a curve in the x-v-plane: for both the position and the velocity of a given
particle are functions of the time. In the case of simple harmonic motion
these curves will be the ellipses $n^2x^2 + v^2 = n^2a^2$. Nothing as yet has been
said as to the density assumed for the distribution of the representative points
in the x-v-plane at the initial instant.* That requires careful consideration.

* It should be remarked that here we are speaking of the density of representative points
and are not thinking of the density of the particles as in the previous section. In fact it might
have been better though less concrete to have spoken in the first instance of representative
points and of their density of distribution rather than of the particles themselves.

In the first place let it be noted that what the density will ultimately be
used for is to determine average values over the ensemble according to the
rule

$$\bar u = \frac{\iint uD\,dx\,dv}{\iint D\,dx\,dv},$$

where $u$ is the quantity to be averaged, where $D\,dx\,dv$ is the product of the
density by the element of area and consequently represents the number of 
motions whose representative points lie in that element, and where the double 
integrals may be extended over any region of the x-v-plane which contains the 
points representative of those motions for which the average is desired. If 
the density D of the distribution at the initial instant were assumed at random, 
the motion of the representative points would in the course of time bring 
about a new distribution which in all probability would differ from the origi- 
nal distribution ; the averages at the later instant would then probably differ 
from the original averages. To avoid these inconveniences it is desirable to 
impose upon the density D the restriction that it shall represent a permanent 
state, that is, that the ensemble of representative points shall be so distributed 
as to be in statistical equilibrium.* 

The conditions for statistical equilibrium are not hard to find. The most 
general assumption for the density D would be that D is a function D(x, v, t) 
of the position, velocity and time. Now if the distribution is permanent, the 
density cannot depend explicitly on the time, and $\partial D/\partial t = 0$. At a given
time the number of representative points within a given rectangle R of the
x-v-plane between x and x + dx and v and v + dv is

$$\int_{x}^{x+dx}\int_{v}^{v+dv} D(x, v)\,\delta x\,\delta v.$$

The number of points can change only by some of the points moving across
one of the four sides of the rectangle. As $\delta x = v\,\delta t$, the points which will
move into R across the left hand side of length dv are those situated in the
small rectangle† of area $\delta x\,dv$ just to the left of R; their number is $Dv\,\delta t\,dv$.
The points which will move out across the right hand side of length dv at x + dx
must be those in the small rectangle of area $\delta' x\,dv$ just to the left of that side
and, as v is here one of the independent variables and is constant along lines
parallel to the x-axis, $\delta' x = \delta x = v\,\delta t$; the number of the points is therefore
$\left(D + \frac{\partial D}{\partial x}dx\right)v\,\delta t\,dv$. As $\delta v = f\,\delta t$, where $f$ denotes the acceleration which
is given by the differential equation as a function of x, the points which will
move into R across the lower side of length dx at v are those in the small rectangle
of area $\delta v\,dx$ just below R; their number is $Df\,\delta t\,dx$. The points
which will move out across the upper side of length dx at v + dv are those in
the small rectangle of area $\delta' v\,dx$ just below that side and, as $f$ is a function
of x alone and is constant along lines parallel to the v-axis, $\delta' v = \delta v = f\,\delta t$;
the number of the points is therefore $\left(D + \frac{\partial D}{\partial v}dv\right)f\,\delta t\,dx$. Hence, if the
total number of points is to remain unchanged,

$$Dv\,\delta t\,dv - \left(D + \frac{\partial D}{\partial x}dx\right)v\,\delta t\,dv + Df\,\delta t\,dx - \left(D + \frac{\partial D}{\partial v}dv\right)f\,\delta t\,dx = 0$$

or

$$\frac{\partial D}{\partial x}\,v + \frac{\partial D}{\partial v}\,f = 0.$$

* Attention may again be called to the fact that the conception of statistical equilibrium
requires the motion to be conservative; for if the motion were dying out, it would not have the
necessary characteristic of permanency in time.

† The same assumption as before relative to the magnitude of $\delta t$ will be made, namely
that $\delta t$ is small compared with the time it would take one of the points to move across the
rectangle R. Moreover, owing to the smallness of $\delta x$ as compared with dv or dx, the number of
points which might move out or in across the short sides $\delta x$ of the small rectangle may be
neglected.

This is the condition for statistical equilibrium which was desired. It is there- 
fore seen that D must satisfy a definite partial differential equation in x and v. 
The solution of the partial differential equation just found depends upon 
the solution of the simultaneous set of equations* 

$$\frac{dx}{v} = \frac{dv}{f} = \frac{dD}{0}.$$

The solution of the first two equations may be obtained at once:

$$f\,dx = v\,dv \quad\text{or}\quad mf\,dx = mv\,dv \quad\text{or}\quad F\,dx = mv\,dv$$

if the mass of the moving particles is m and F is the force acting on it. The
equation may be simplified by writing $F = -dV/dx$, where V is the potential
energy (the motion is assumed to be conservative). Then

$$-dV = d\left(\tfrac{1}{2}mv^2\right), \quad\text{and finally}\quad \tfrac{1}{2}mv^2 + V = \text{const.}$$

* It may be noted that if these equations be written in parametric form as

$$\frac{dx}{v} = \frac{dv}{f} = \frac{dt}{1}$$

they define the path of the representative point in the x-v-plane.

Hence the value of D is $D = \Omega(\text{const.}) = \Omega(E)$, where E is the total energy
and $\Omega$ is any arbitrary function. It therefore is seen that the condition for
statistical equilibrium in the distribution of representative points for possible
motions of a conservative dynamical system of one degree of freedom is that
the assumed density should be a function of the total energy.
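
That a density depending on x and v only through the energy satisfies the equilibrium condition can be checked by finite differences. The following is a minimal Python sketch for the harmonic case f = −n²x with unit mass; the helper names, the step h, the sample point, and the two trial densities are assumptions chosen for illustration.

```python
import math


def energy(x, v, n):
    """Total energy for unit mass and acceleration f = -n^2 x."""
    return 0.5 * v * v + 0.5 * n * n * x * x


def equilibrium_residual(density, x, v, n, h=1e-5):
    """Central-difference estimate of v * dD/dx + f * dD/dv at (x, v)."""
    dD_dx = (density(x + h, v) - density(x - h, v)) / (2 * h)
    dD_dv = (density(x, v + h) - density(x, v - h)) / (2 * h)
    f = -n * n * x
    return v * dD_dx + f * dD_dv


if __name__ == "__main__":
    n = 1.3

    def of_energy(x, v):
        # A function of the energy alone.
        return math.exp(-energy(x, v, n))

    def not_of_energy(x, v):
        # A density that is not a function of the energy.
        return math.exp(-(x - 1.0) ** 2 - v * v)

    print(equilibrium_residual(of_energy, 0.7, -0.4, n))      # close to zero
    print(equilibrium_residual(not_of_energy, 0.7, -0.4, n))  # clearly nonzero
```

The first residual vanishes to within the accuracy of the differences, while the second does not, so the second density would drift from its initial form as the representative points move.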

The ensemble-averages will naturally depend upon the choice of the arbitrary
function $\Omega$. Suppose that (in the special example under consideration
where $f = -n^2x$ and $m = 1$) this is taken as

$$D = \Omega(E) = ke^{-c^2E} = ke^{-c^2\left(\frac{1}{2}v^2 + \frac{1}{2}n^2x^2\right)},$$

where k and c are constants; and let it be assumed that any values are permissible
for the velocity and position of the particle so that the representative
points are spread all over the x-v-plane. The mean square of the displacement
x will then be

$$\overline{x^2} = \frac{\displaystyle\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} x^2 e^{-c^2\left(\frac{1}{2}v^2 + \frac{1}{2}n^2x^2\right)}\,dx\,dv}{\displaystyle\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-c^2\left(\frac{1}{2}v^2 + \frac{1}{2}n^2x^2\right)}\,dx\,dv}.$$

The form of the integrals is such that the integrations with respect to x and v
are quite independent. Hence, discarding the integration with respect to v
which introduces merely a factor in numerator and denominator, the result
may be written as*

$$\overline{x^2} = \frac{\displaystyle\int_{-\infty}^{+\infty} x^2 e^{-\frac{1}{2}c^2n^2x^2}\,dx}{\displaystyle\int_{-\infty}^{+\infty} e^{-\frac{1}{2}c^2n^2x^2}\,dx} = \frac{1}{c^2n^2}.$$

The average value of the potential energy would be $1/2c^2$. It may readily be
shown that the average of the kinetic energy would also be $1/2c^2$. The average
displacement would of course be zero. To find the average of the magnitude
$|x|$ of the displacement it would merely be necessary to evaluate

$$\overline{|x|} = \frac{\displaystyle\int_{v=-\infty}^{+\infty}\int_{x=0}^{+\infty} x\,e^{-c^2\left(\frac{1}{2}v^2 + \frac{1}{2}n^2x^2\right)}\,dx\,dv}{\displaystyle\int_{v=-\infty}^{+\infty}\int_{x=0}^{+\infty} e^{-c^2\left(\frac{1}{2}v^2 + \frac{1}{2}n^2x^2\right)}\,dx\,dv} = \frac{1}{cn}\sqrt{\frac{2}{\pi}}.$$

* By the ordinary methods of evaluating these integrals or by B. O. Peirce's Short Table of
Integrals, formulas 492 and *94 of the revised edition.



Other averages with this assumption for the density of the distribution and for 
the extent of the distribution of the representative points in the plane may be 
obtained if desired. 
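
The averages quoted for the exponential density can be checked by direct quadrature, since the integrations in x and in v separate. The following is a minimal Python sketch under stated assumptions: a midpoint rule over a range of twelve standard widths replaces the infinite integrals, and the function names, dictionary keys, and sample values of c and n are hypothetical.

```python
import math


def gaussian_moment(power, rate, samples=100_000, span=12.0):
    """Average of |s|**power under the weight exp(-rate * s**2 / 2),
    approximated by a midpoint rule on a wide but finite interval."""
    width = span / math.sqrt(rate)
    ds = 2 * width / samples
    numerator = 0.0
    denominator = 0.0
    for i in range(samples):
        s = -width + (i + 0.5) * ds
        weight = math.exp(-0.5 * rate * s * s)
        numerator += abs(s) ** power * weight
        denominator += weight
    return numerator / denominator


def canonical_averages(c, n):
    """Ensemble-averages for D = k*exp(-c^2 * (v^2/2 + n^2 x^2/2)), unit mass."""
    mean_x2 = gaussian_moment(2, c * c * n * n)     # weight in x
    mean_abs_x = gaussian_moment(1, c * c * n * n)
    mean_v2 = gaussian_moment(2, c * c)             # weight in v
    return {
        "x2": mean_x2,
        "abs_x": mean_abs_x,
        "potential": 0.5 * n * n * mean_x2,
        "kinetic": 0.5 * mean_v2,
    }


if __name__ == "__main__":
    c, n = 2.0, 3.0
    avg = canonical_averages(c, n)
    print(avg["x2"], 1 / (c * c * n * n))
    print(avg["abs_x"], math.sqrt(2 / math.pi) / (c * n))
    print(avg["potential"], avg["kinetic"], 1 / (2 * c * c))
```

The constant k never enters, because it cancels between numerator and denominator of each average.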

To connect the present discussion where the energy is allowed to have 
different values with the preceding case in which the energy was fixed, it is 
merely necessary to consider those motions whose representative points lie between
the curves E = C and E = C + dC of equal energy. In the above example
these are similar and similarly situated ellipses. If x and E instead of
x and v are taken as the independent variables, the density of distribution
takes the form

$$D\,dx\,dv = \Omega(E)\,\frac{\partial v}{\partial E}\,dx\,dE = \frac{\Omega(E)}{m}\,dE\,\frac{dx}{v},$$

where m is the mass of the particle; and hence it appears that if the energy
be held constant the density regarded as a function of x is inversely proportional
to the velocity. This is in accord with the results of the last section,
as was to have been expected. For different energies the density is no
longer inversely proportional to the velocity because the factor $\Omega(E)$ is different for
different values of E.

Once the density is fixed as $\Omega(E)$, the probability function which is to
be taken proportional to the density becomes $P = k\,\Omega(E)$. If the limits of
the region throughout which the representative points are assumed to be spread
are also assigned, the value of k may be determined by the relation

$$\iint k\,\Omega(E)\,dx\,dv = 1 \quad\text{or}\quad \iint \frac{k}{m}\,\frac{\Omega(E)}{v}\,dx\,dE = 1,$$

where the double integral is extended over the entire region occupied by the 
representative points: for the total probability is 1. It is thus seen that whereas
in the simplest case where there is a constant energy it is most natural to
start with the conception of the probability and the time-average and then to
proceed to the ensemble and to its density of distribution, in the case of variable
energy it is simpler and almost necessary to start with the ensemble in
statistical equilibrium and proceed to the probability. And whereas the time-averages
and ensemble-averages were identical in the first case, they are distinct
in the second; or better, one might say that the conception of a time-average
implied the following of a definite motion or at most a group of motions
which differed only by an additive constant for the time and that therefore the
conception became illusory in the second case and we were forced to use
ensemble-averages.

3. Systems with two degrees of freedom. Further to investigate 
the advantage or necessity of having recourse to the ensemble in statistical 
equilibrium consider what happens when averages are desired for a particle or 
system with two coordinates. To exhibit some of the points at issue it will be 
sufficient to consider the simple case of harmonic motion relative to two per- 
pendicular axes and defined by the equations 

$$\frac{d^2x}{dt^2} = -n^2x, \quad \frac{d^2y}{dt^2} = -m^2y, \quad \frac{dx}{dt} = n\sqrt{a^2 - x^2}, \quad \frac{dy}{dt} = m\sqrt{b^2 - y^2},$$

$$x = a\cos(nt + \gamma), \quad y = b\sin(mt + \delta), \quad T = \tfrac{1}{2}\left(n^2a^2 + m^2b^2 - n^2x^2 - m^2y^2\right).$$

As the mass of the particle has been taken as unity, we have $v = \sqrt{2T}$ and the
time required for the description of the arc ds is

$$dt = \frac{ds}{v} = \frac{\sqrt{dx^2 + dy^2}}{\sqrt{n^2a^2 + m^2b^2 - n^2x^2 - m^2y^2}}.$$

If as in the first section the probability be considered as proportional to this
time, there results

$$P = k\,dt = \frac{k\,ds}{v}, \qquad \int \frac{k\,ds}{v} = 1,$$

where the integral extends over the entire length of the curve described
by the particle.

Now if the constants n and m are commensurable, the particle will
describe a closed path, the integral which must be evaluated to determine k
will converge, and the value of k may be found. Then the time-averages of
any quantities connected with the motion could be found as in the first
section. But if m and n are incommensurable, the particle will describe an
endless Lissajous' curve, the integral will diverge, the determination of k will
be impossible, and the whole problem will become illusory. It appears,
however, that the particle after a sufficient lapse of time may be found as
near as desired to its original position and with its velocities $v_x$ and $v_y$ as near
as desired to their original values. By disregarding these small differences the
path may artificially be considered as closed, a value of k may be computed
on this assumption, and the averages may be found. This method of dealing
with the problem is possible, might readily be adopted from a mathematical
point of view, and would undoubtedly give satisfactory physical results.
But the method which actually is adopted is to pass over to the consideration
of ensembles distributed in statistical equilibrium. This will now be
explained.
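The contrast between the two cases may be illustrated numerically. The following is a minimal sketch and not part of the original argument; the function names, the amplitudes a = b = 1, and the particular frequencies are all assumptions chosen for illustration. For integral n and m the path closes after the time 2π/gcd(n, m), so a time-average over one circuit is well defined; for incommensurable frequencies the sketch merely searches for a time at which the particle has returned near its initial position and velocities.

```python
import math

def state(t, n, m, a=1.0, b=1.0, gamma=0.0, delta=0.0):
    """Return (x, y, vx, vy) for x = a cos(nt + gamma), y = b sin(mt + delta)."""
    x = a * math.cos(n * t + gamma)
    y = b * math.sin(m * t + delta)
    vx = -n * a * math.sin(n * t + gamma)
    vy = m * b * math.cos(m * t + delta)
    return (x, y, vx, vy)

def closing_time(n, m):
    """Time after which the path closes when n and m are positive integers."""
    return 2.0 * math.pi / math.gcd(n, m)

def time_average(u, n, m, period, steps=20000):
    """Average u(x, y, vx, vy) over one circuit by the midpoint rule in t."""
    dt = period / steps
    total = sum(u(*state((i + 0.5) * dt, n, m)) for i in range(steps))
    return total / steps

def first_near_return(n, m, tol, t_max=2000.0, dt=0.005):
    """First sampled t >= 1 at which the state lies within tol of the initial state."""
    start = state(0.0, n, m)
    t = 1.0
    while t < t_max:
        if math.dist(state(t, n, m), start) < tol:
            return t
        t += dt
    return None  # no near return found on the sampled grid

# Commensurable case n = 2, m = 3: a closed path, so the time-average exists.
period = closing_time(2, 3)
avg_x2 = time_average(lambda x, y, vx, vy: x * x, 2, 3, period)

# Incommensurable case n = 1, m = sqrt(2): only approximate returns occur.
t_return = first_near_return(1.0, math.sqrt(2.0), tol=0.1)
```

In the closed case the average of x² comes out as a²/2, as for a single harmonic motion. In the second case the tolerance `tol` plays the part of the "small differences" which the text proposes to disregard, and the time of return grows as the tolerance is reduced.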

Four initial conditions are required to specify the motion of the particle,
namely, the initial values of x, y, $v_x$, $v_y$. Let these four quantities be
interpreted as rectangular coordinates in a space of four dimensions, thus obtaining
in that space a representative point for the motion. Each point of the space
will correspond to some motion* and every motion will have a definite
representative point at the initial instant. As time passes, the representative points
will describe paths of one dimension, that is, curves in the representative space
of four dimensions. Everything is quite similar to the simpler case where
there was only one coordinate and the representative space had two dimensions.
Here the most general assumption for the density would be that it was a
function $D(x, y, v_x, v_y, t)$ of the five variables which enter into the
description of the motion. If there is to be statistical equilibrium, that is, if the
density is not to depend on the time, considerations like those previously
advanced show that D cannot depend explicitly upon t and must satisfy the
partial differential equation

$$\frac{\partial D}{\partial x} v_x + \frac{\partial D}{\partial y} v_y + \frac{\partial D}{\partial v_x} f_x + \frac{\partial D}{\partial v_y} f_y = 0.$$

The details of the deduction may be omitted. 

The solution of this partial differential equation depends on the solution
of the system of ordinary equations

$$\frac{dx}{v_x} = \frac{dy}{v_y} = \frac{dv_x}{f_x} = \frac{dv_y}{f_y} = \frac{dD}{0}.$$

* There may be need of restricting the range of representative points to some region of
the four-dimensional space; for the energy must be real and moreover it may be desirable to
restrict our discussion to only some of all the possible motions.

There will be three independent integrals of these equations and the density
D will be a function, an arbitrary function, of the three. In the particular
case considered above the equations become

$$\frac{dx}{v_x} = \frac{dy}{v_y} = \frac{dv_x}{-n^2 x} = \frac{dv_y}{-m^2 y} = \frac{dD}{0},$$

and the integrals are readily seen to be

$$n^2x^2 + v_x^2 = 2c_1, \qquad m^2y^2 + v_y^2 = 2c_2,$$

$$\frac{1}{n}\cos^{-1}\frac{nx}{\sqrt{n^2x^2 + v_x^2}} - \frac{1}{m}\cos^{-1}\frac{my}{\sqrt{m^2y^2 + v_y^2}} = c_3;$$

and hence if the distribution is to be in statistical equilibrium it is necessary 
and sufficient that the density be 

$$D = \Omega(c_1, c_2, c_3).$$
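That a function of the integrals satisfies the condition of statistical equilibrium may be checked numerically. The sketch below is an illustration only: the frequencies n = 2 and m = 3, the particular function exp(−c₁ − 2c₂), and the helper names are assumptions, and the partial derivatives are merely estimated by central differences. It evaluates the left-hand member of the partial differential equation for D, with the forces −n²x and −m²y of the harmonic case, at one representative point.

```python
import math

N, M = 2.0, 3.0  # assumed values of the frequencies n and m

def c1(x, vx):
    """First integral: (n^2 x^2 + vx^2) / 2."""
    return 0.5 * (N * N * x * x + vx * vx)

def c2(y, vy):
    """Second integral: (m^2 y^2 + vy^2) / 2."""
    return 0.5 * (M * M * y * y + vy * vy)

def density(x, y, vx, vy):
    """An arbitrary smooth function of c1 and c2 (an illustrative choice)."""
    return math.exp(-c1(x, vx) - 2.0 * c2(y, vy))

def partial(f, point, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at point."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2.0 * h)

def equilibrium_residual(f, point):
    """Estimate D_x vx + D_y vy + D_vx fx + D_vy fy for the harmonic forces."""
    x, y, vx, vy = point
    fx, fy = -N * N * x, -M * M * y
    return (partial(f, point, 0) * vx + partial(f, point, 1) * vy
            + partial(f, point, 2) * fx + partial(f, point, 3) * fy)

point = (0.3, -0.2, 0.5, 0.4)
residual = equilibrium_residual(density, point)

# For contrast, a density which is not a function of the integrals.
bad_residual = equilibrium_residual(
    lambda x, y, vx, vy: math.exp(-x * x - vy * vy), point)
```

The first residual vanishes to within the error of the difference quotients, while the second does not; a distribution of the second kind would therefore change with the time.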

It will be noticed that the sum of $c_1$ and $c_2$ is the total energy E and the
density may therefore also be taken as

$$D = \Omega_1(E, c_1 - c_2, c_3).$$

Averages over the ensemble are obtained by quadruple integration 

$$\bar{u} = \frac{\int\!\!\int\!\!\int\!\!\int u\,D\,dx\,dy\,dv_x\,dv_y}{\int\!\!\int\!\!\int\!\!\int D\,dx\,dy\,dv_x\,dv_y}$$

extended over such four-dimensional regions as contain the representative 
points of the motions for which the average is desired. 
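The ratio of quadruple integrals may be approximated directly on a grid. The following sketch is an illustration under stated assumptions and not a method taken from the text: the density is assumed to be exp(−E) with n = m = 1, the region of integration is assumed to be the cube from −5 to 5 in each of the four variables, and each integral is replaced by a midpoint sum. The names `energy` and `ensemble_average` are hypothetical.

```python
import math
from itertools import product

def energy(x, y, vx, vy, n=1.0, m=1.0):
    """Total energy of the unit-mass particle in the two harmonic fields."""
    return 0.5 * (vx * vx + vy * vy + n * n * x * x + m * m * y * y)

def ensemble_average(u, density, lo=-5.0, hi=5.0, cells=16):
    """Ratio of the two quadruple integrals, each taken by the midpoint rule."""
    h = (hi - lo) / cells
    mids = [lo + (i + 0.5) * h for i in range(cells)]
    numerator = 0.0
    denominator = 0.0
    for x, y, vx, vy in product(mids, repeat=4):
        d = density(x, y, vx, vy)
        numerator += u(x, y, vx, vy) * d
        denominator += d
    # The common volume element h**4 cancels in the ratio.
    return numerator / denominator

# Assumed density: a function of the energy alone, D = exp(-E).
avg_x2 = ensemble_average(
    lambda x, y, vx, vy: x * x,
    lambda x, y, vx, vy: math.exp(-energy(x, y, vx, vy)))
```

For this density the exact average of x² over the whole space is 1, and the grid value agrees with it closely; a different region or density is obtained by changing the limits or the second argument.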

It should be noted that the three integrals $c_1$, $c_2$, $c_3$ are invariants for
the motion of any given particle, that is, are constant in time when any particular
motion is followed. For they satisfy the above set of simultaneous
ordinary equations and hence also the partial differential equation which D
satisfies. That is,

$$\frac{\partial c_1}{\partial x} v_x + \frac{\partial c_1}{\partial y} v_y + \frac{\partial c_1}{\partial v_x} f_x + \frac{\partial c_1}{\partial v_y} f_y = 0,$$

and so for the other two. But as $c_1$, and also the others, do not contain the
time explicitly, it follows that $\partial c_1/\partial t = 0$ and hence

$$\frac{dc_1}{dt} = \frac{\partial c_1}{\partial t} + \frac{\partial c_1}{\partial x} v_x + \frac{\partial c_1}{\partial y} v_y + \frac{\partial c_1}{\partial v_x} f_x + \frac{\partial c_1}{\partial v_y} f_y = 0,$$

which shows that the total derivative of $c_1$ with respect to the time is 0.
It is likewise evident that any function of these three invariants is an
invariant of the motion. Moreover there could not be a fourth invariant which
was functionally independent of these three. For if there were four invariants
$c_1$, $c_2$, $c_3$, $c_4$ which were functionally independent, they could be solved for
the four arguments x, y, $v_x$, $v_y$; they would therefore entirely fix the
representative point, and there would be no possibility of passing from one such
point to another in following the motion. It is clear that, apart from the
particular case of harmonic motion, there will be for the motion of any system
with two degrees of freedom three and only three functionally independent
invariants. If there is statistical equilibrium, the density must be a
function of these three quantities and, as the energy is surely an invariant of the
motion, it may be taken as one of the three.
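The constancy of the three invariants of the harmonic case along a motion may be verified by direct evaluation. The sketch below is illustrative only; the constants n, m, a, b, γ, δ are assumed values, and the interval of time is deliberately chosen so that both velocities remain negative, since the third invariant, written with inverse cosines, keeps a single constant value only while the motion stays on one branch of those functions.

```python
import math

def invariants(x, y, vx, vy, n, m):
    """The three integrals of the harmonic case, evaluated at one point."""
    c1 = 0.5 * (n * n * x * x + vx * vx)
    c2 = 0.5 * (m * m * y * y + vy * vy)
    c3 = (math.acos(n * x / math.sqrt(2.0 * c1)) / n
          - math.acos(m * y / math.sqrt(2.0 * c2)) / m)
    return c1, c2, c3

def trajectory_point(t, n, m, a, b, gamma, delta):
    """State (x, y, vx, vy) on x = a cos(nt + gamma), y = b sin(mt + delta)."""
    x = a * math.cos(n * t + gamma)
    y = b * math.sin(m * t + delta)
    vx = -n * a * math.sin(n * t + gamma)
    vy = m * b * math.cos(m * t + delta)
    return x, y, vx, vy

# Assumed constants of the motion, for illustration.
n, m, a, b, gamma, delta = 2.0, 3.0, 1.5, 0.7, 0.3, 2.0

# Times from 0 to 0.3, over which vx < 0 and vy < 0 throughout.
times = [0.003 * k for k in range(101)]
values = [invariants(*trajectory_point(t, n, m, a, b, gamma, delta), n, m)
          for t in times]

# Greatest variation of each invariant over the sampled arc.
spread = [max(v[i] for v in values) - min(v[i] for v in values)
          for i in range(3)]
```

On this arc the first two invariants equal n²a²/2 and m²b²/2, and the third equals γ/n − (δ − π/2)/m, which fixes the relative phase of the two component motions.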

An interesting connection with the theory of time-averages may now be
made. For, suppose that the density is chosen as a function of the energy
alone, as $D = \Omega(E)$, and let the energy be taken as one of the independent
variables replacing $v_x$. Then the density may be written in the form (if it
be assumed that the mass is unity)

$$D\,dx\,dy\,dv_x\,dv_y = \Omega(E)\,\frac{dx}{v_x}\,dy\,dv_y\,dE = \Omega(E)\,dy\,dv_y\,dE\,\frac{ds}{v}.\,{}^{*}$$

Now consider all the motions whose representative points lie between the two
surfaces (of three dimensions) E = C and E = C + dC in the representative
space of four dimensions. For these motions $\Omega(E)$ has the sensibly
constant value $\Omega(C)$ and hence it appears that the density is inversely
proportional to the velocity with a constant factor of proportionality $\Omega(C)$. As
the probability is taken as directly proportional to the density, it is seen that
the probability is proportional to the time of describing the arc ds. Hence if
averages be obtained by integration over the region between the surfaces E =
C and E = C + dC, those averages may be considered as time-averages,
and are generally so considered, although it is clear that motions whose
representative points lie between these surfaces are by no means so restricted
as to differ merely by an additive constant for the time. Thus it appears
that the introduction of statistical equilibrium and ensemble-averages has led
naturally to a very considerable extension in the idea of a time-average, an
extension which at first sight might have seemed to be contradictory to any
notion of time-average.

* For $\dfrac{\partial v_x}{\partial E} = \dfrac{1}{v_x}$ and $dx : \sqrt{dx^2 + dy^2} = v_x : \sqrt{v_x^2 + v_y^2}$.

Conclusion. The foregoing discussion has been presented as an
introduction to the general theory of statistical mechanics. It is not intended to
present that theory here. Readers may refer to such portions of Gibbs's
treatise on the subject* as may interest them. It may, however, be stated that
the subject is concerned not so much with the physical system which is
executing the motion as with the differential equations of the motion. The word
particle has been employed so largely merely for definiteness. The work of
the text which was connected with the equation $d^2x/dt^2 = -n^2x$ would apply
equally well to the vibrations of a torsion pendulum where x represented
angular rotation and v represented angular velocity. The problem of article 3
could be interpreted as being that of the vibrations of any system depending
on two generalised normal coordinates. Under this class of problems would
come the case of a stretched massless string carrying two particles constrained
to move in a plane and in lines perpendicular to the position of equilibrium of
the string.†

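In considering the motion of any conservative system of which the position
is specified by n generalised coordinates, it is customary to introduce the
generalised momenta in place of the generalised velocities. If $q_1, q_2, \ldots, q_n$ are
the generalised coordinates, T the kinetic energy, and V the potential energy,
the generalised momenta are

$$p_1 = \frac{\partial T}{\partial \dot{q}_1}, \quad p_2 = \frac{\partial T}{\partial \dot{q}_2}, \quad \ldots, \quad p_n = \frac{\partial T}{\partial \dot{q}_n}$$

by definition. The equations of motion are then

$$\dot{p}_1 = \frac{\partial (T - V)}{\partial q_1}, \quad \ldots, \quad \dot{p}_n = \frac{\partial (T - V)}{\partial q_n}$$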

* Elementary Principles in Statistical Mechanics by J. Willard Gibbs, 1901. 

† This problem is frequently treated as a preparation for considering the string as loaded
with n particles where n is allowed to become infinite and thus lead to the vibrations of a violin
string according to the method used by Lagrange.




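in the Lagrangian form, or in the Hamiltonian form they are

$$\dot{p}_1 = -\frac{\partial (T + V)}{\partial q_1}, \qquad \dot{q}_1 = \frac{\partial (T + V)}{\partial p_1}, \qquad \ldots$$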
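The Hamiltonian form lends itself to step-by-step numerical integration. The sketch below is an illustration only and not a procedure of the text: it assumes unit masses and the harmonic potential V = ½Σ wᵢ²qᵢ², and it advances the coordinates and momenta by the leapfrog (kick, drift, kick) rule, a standard scheme substituted here for an exact solution. The names `hamiltonian` and `leapfrog_step` are hypothetical.

```python
def hamiltonian(q, p, freqs):
    """E = T + V for unit masses and V = (1/2) * sum of (w_i * q_i)^2."""
    kinetic = 0.5 * sum(pi * pi for pi in p)
    potential = 0.5 * sum((w * qi) ** 2 for w, qi in zip(freqs, q))
    return kinetic + potential

def leapfrog_step(q, p, freqs, dt):
    """One kick-drift-kick step of dq/dt = dE/dp, dp/dt = -dE/dq."""
    # Half kick: dp/dt = -dE/dq = -w^2 q.
    p = [pi - 0.5 * dt * w * w * qi for pi, qi, w in zip(p, q, freqs)]
    # Full drift: dq/dt = dE/dp = p.
    q = [qi + dt * pi for qi, pi in zip(q, p)]
    # Second half kick with the updated coordinates.
    p = [pi - 0.5 * dt * w * w * qi for pi, qi, w in zip(p, q, freqs)]
    return q, p

# Assumed initial conditions for a system with two degrees of freedom.
freqs = (2.0, 3.0)
q, p = [1.0, 0.5], [0.0, 0.3]
e_start = hamiltonian(q, p, freqs)
for _ in range(5000):
    q, p = leapfrog_step(q, p, freqs, dt=0.001)
e_end = hamiltonian(q, p, freqs)
relative_drift = abs(e_end - e_start) / e_start
```

The energy E = T + V, being an integral of the equations, should remain sensibly constant along the computed motion; the small residual drift measures the error of the step-by-step rule and not any property of the system.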

There are 2n initial conditions corresponding to the initial values of the 2n
coordinates q and momenta p. These 2n quantities may be regarded as the
coordinates of a representative point in 2n dimensions. The condition for 
statistical equilibrium is then that the density D of the distribution of these 
representative points over the auxiliary space of 2n dimensions shall satisfy 
the partial differential equation 

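$$\frac{\partial D}{\partial q_1}\dot{q}_1 + \frac{\partial D}{\partial p_1}\dot{p}_1 + \cdots + \frac{\partial D}{\partial q_n}\dot{q}_n + \frac{\partial D}{\partial p_n}\dot{p}_n = 0.$$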

The solution of this equation for D depends on the solution of the set of 
simultaneous equations 

$$\frac{dq_1}{\dot{q}_1} = \frac{dp_1}{\dot{p}_1} = \cdots = \frac{dq_n}{\dot{q}_n} = \frac{dp_n}{\dot{p}_n} = \frac{dD}{0}.$$

The total energy E = T + V is one integral of these equations and there are
$2n - 2$ other integrals $C_1, C_2, \ldots, C_{2n-2}$. The density D is therefore any
arbitrary function $\Omega$ of these quantities,

$$D = \Omega(E, C_1, C_2, \ldots, C_{2n-2}).$$

It is generally customary to consider the density as a function of the energy
alone. The particular function which Gibbs uses and which, under the
assumptions which he makes, establishes the principle of the equipartition of
kinetic energy, is

$$D \propto P = e^{\frac{\psi - E}{\Theta}}, \quad \text{where } \Theta \text{ and } \psi \text{ are constants,}$$

and the distribution is then called canonical. In the particular case where 
the attention is concentrated on the motions whose representative points lie 
between two adjacent surfaces E = C and E = C + dC of constant energy, 
the distribution is called microcanonical. The advantage of discussing the gen- 
eral canonical case first and the microcanonical case second has been sufficient- 
ly illustrated by the treatment of the problems in the text. 
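The equipartition of kinetic energy under a canonical distribution may be illustrated by random sampling. The sketch below is a minimal illustration under stated assumptions: it takes the two-coordinate harmonic system of article 3 with unit mass, assumes a density proportional to exp(−E/Θ), and uses the fact that for this energy the four variables are then independent and normally distributed. The value Θ = 1.5, the frequencies, and the function names are hypothetical choices.

```python
import random

def sample_canonical(theta, n, m, rng):
    """One point (x, y, vx, vy) drawn with density proportional to exp(-E/theta)."""
    root = theta ** 0.5
    x = rng.gauss(0.0, root / n)   # variance theta / n^2
    y = rng.gauss(0.0, root / m)   # variance theta / m^2
    vx = rng.gauss(0.0, root)      # variance theta
    vy = rng.gauss(0.0, root)      # variance theta
    return x, y, vx, vy

def mean_kinetic_energies(theta, n, m, samples=100000, seed=0):
    """Sample means of vx^2/2 and vy^2/2 over the canonical ensemble."""
    rng = random.Random(seed)
    kx = 0.0
    ky = 0.0
    for _ in range(samples):
        _, _, vx, vy = sample_canonical(theta, n, m, rng)
        kx += 0.5 * vx * vx
        ky += 0.5 * vy * vy
    return kx / samples, ky / samples

theta = 1.5  # assumed value of the constant in the exponent
kx, ky = mean_kinetic_energies(theta, n=2.0, m=3.0)
```

Both sample means approach Θ/2 whatever the two frequencies may be, which is the equal sharing of kinetic energy between the two degrees of freedom in this simple case.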

Massachusetts Institute of Technology, 
January, 1909.



